GradQuant: Introduction to Network Analysis in igraph

Network Analysis Introduction

What is a network?

A network graph depicts interconnections between nodes.

The presence or absence of each interconnection indicates whether there exists some form of connection between each pair of nodes.

Examples

  • Google Maps
  • Friendships between individuals
  • Flight routes between cities
  • Connections between brain regions.
  • Communications networks, telephone, satellite, cable and internet
  • Discrete words/phrases and their linguistic relations
  • Atomic or molecular structures

Terms

Nodes:

Install and Load Relevant Packages

# install.packages("igraph") # For network analysis
# install.packages("igraphdata") # For sample datasets
# install.packages("RColorBrewer") # For colors
library(igraph)
library(igraphdata)
library(RColorBrewer)
data("UKfaculty")

Structures

Below is a randomly created adjacency matrix. ‘1’ denotes an edge (i.e., connection), and ‘0’ denotes the absence of an edge

set.seed(52)
pilotAdj <- sapply(1:10, function(x) sample(c(0,1),10,replace=T))
diag(pilotAdj) <- 0 # No self-connections
print(pilotAdj)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0    0    1    0    1    0    0    1     0
 [2,]    1    0    1    0    1    0    1    1    0     0
 [3,]    0    0    0    0    1    1    0    1    1     0
 [4,]    0    1    0    0    0    0    0    0    0     1
 [5,]    1    0    1    0    0    0    0    1    1     0
 [6,]    0    1    0    0    1    0    1    0    1     0
 [7,]    0    1    1    0    1    0    0    1    0     0
 [8,]    1    1    1    1    0    1    0    0    0     0
 [9,]    0    1    0    1    1    1    1    0    0     1
[10,]    1    1    0    0    1    1    0    0    1     0

This may seem abstract so let’s add some labels to the nodes to make it more meaningful:

people <- c("Luke","Anne","Zee","Bob","Tim","Jazmin","Juan","Kalani","Liz","Brent")
colnames(pilotAdj) <- people
rownames(pilotAdj) <- people
pilotAdj
       Luke Anne Zee Bob Tim Jazmin Juan Kalani Liz Brent
Luke      0    0   0   1   0      1    0      0   1     0
Anne      1    0   1   0   1      0    1      1   0     0
Zee       0    0   0   0   1      1    0      1   1     0
Bob       0    1   0   0   0      0    0      0   0     1
Tim       1    0   1   0   0      0    0      1   1     0
Jazmin    0    1   0   0   1      0    1      0   1     0
Juan      0    1   1   0   1      0    0      1   0     0
Kalani    1    1   1   1   0      1    0      0   0     0
Liz       0    1   0   1   1      1    1      0   0     1
Brent     1    1   0   0   1      1    0      0   1     0

Adjacency Matrix

In the adjacency matrix representation, a graph is represented in the form of a two-dimensional array. The size of the array is V x V, where V is the set of vertices. The following image represents the adjacency matrix representation:

Converting an Adjacency Matrix to a Graph

An Adjacency list is an array consisting of the address of all the linked lists. The first node of the linked list represents the vertex and the remaining lists connected to this node represents the vertices to which this node is connected. This representation can also be used to represent a weighted graph. The linked list can slightly be changed to even store the weight of the edge.

pilotGraph <- graph_from_adjacency_matrix(pilotAdj)
pilotGraph
IGRAPH 8e6f538 DN-- 10 42 -- 
+ attr: name (v/c)
+ edges from 8e6f538 (vertex names):
 [1] Luke  ->Bob    Luke  ->Jazmin Luke  ->Liz    Anne  ->Luke   Anne  ->Zee   
 [6] Anne  ->Tim    Anne  ->Juan   Anne  ->Kalani Zee   ->Tim    Zee   ->Jazmin
[11] Zee   ->Kalani Zee   ->Liz    Bob   ->Anne   Bob   ->Brent  Tim   ->Luke  
[16] Tim   ->Zee    Tim   ->Kalani Tim   ->Liz    Jazmin->Anne   Jazmin->Tim   
[21] Jazmin->Juan   Jazmin->Liz    Juan  ->Anne   Juan  ->Zee    Juan  ->Tim   
[26] Juan  ->Kalani Kalani->Luke   Kalani->Anne   Kalani->Zee    Kalani->Bob   
[31] Kalani->Jazmin Liz   ->Anne   Liz   ->Bob    Liz   ->Tim    Liz   ->Jazmin
[36] Liz   ->Juan   Liz   ->Brent  Brent ->Luke   Brent ->Anne   Brent ->Tim   
+ ... omitted several edges

Converting an igraph to an edgelist

An edge list is a list or array of all the edges in a graph. Edge lists are one of the easier representations of a graph. In this implementation, the underlying data structure for keeping track of all the nodes and edges is a single list of pairs. Each pair represents a single edge and is comprised of the two unique IDs of the nodes involved. Each line/edge in the graph gets an entry in the edge list, and that single data structure then encodes all nodes and relationships.

pilotEList <- as_edgelist(pilotGraph)
pilotEList
      [,1]     [,2]    
 [1,] "Luke"   "Bob"   
 [2,] "Luke"   "Jazmin"
 [3,] "Luke"   "Liz"   
 [4,] "Anne"   "Luke"  
 [5,] "Anne"   "Zee"   
 [6,] "Anne"   "Tim"   
 [7,] "Anne"   "Juan"  
 [8,] "Anne"   "Kalani"
 [9,] "Zee"    "Tim"   
[10,] "Zee"    "Jazmin"
[11,] "Zee"    "Kalani"
[12,] "Zee"    "Liz"   
[13,] "Bob"    "Anne"  
[14,] "Bob"    "Brent" 
[15,] "Tim"    "Luke"  
[16,] "Tim"    "Zee"   
[17,] "Tim"    "Kalani"
[18,] "Tim"    "Liz"   
[19,] "Jazmin" "Anne"  
[20,] "Jazmin" "Tim"   
[21,] "Jazmin" "Juan"  
[22,] "Jazmin" "Liz"   
[23,] "Juan"   "Anne"  
[24,] "Juan"   "Zee"   
[25,] "Juan"   "Tim"   
[26,] "Juan"   "Kalani"
[27,] "Kalani" "Luke"  
[28,] "Kalani" "Anne"  
[29,] "Kalani" "Zee"   
[30,] "Kalani" "Bob"   
[31,] "Kalani" "Jazmin"
[32,] "Liz"    "Anne"  
[33,] "Liz"    "Bob"   
[34,] "Liz"    "Tim"   
[35,] "Liz"    "Jazmin"
[36,] "Liz"    "Juan"  
[37,] "Liz"    "Brent" 
[38,] "Brent"  "Luke"  
[39,] "Brent"  "Anne"  
[40,] "Brent"  "Tim"   
[41,] "Brent"  "Jazmin"
[42,] "Brent"  "Liz"   

Adjacency List

In the adjacency list representation, a graph is represented as an array of linked list. The index of the array represents a vertex and each element in its linked list represents the  vertices that form an edge with the vertex. The following image represents the adjacency list representation: 

Converting to adjacency list

as_adj_list(pilotGraph)
$Luke
+ 7/10 vertices, named, from 8e6f538:
[1] Anne   Bob    Tim    Jazmin Kalani Liz    Brent 

$Anne
+ 11/10 vertices, named, from 8e6f538:
 [1] Luke   Zee    Bob    Tim    Jazmin Juan   Juan   Kalani Kalani Liz   
[11] Brent 

$Zee
+ 8/10 vertices, named, from 8e6f538:
[1] Anne   Tim    Tim    Jazmin Juan   Kalani Kalani Liz   

$Bob
+ 5/10 vertices, named, from 8e6f538:
[1] Luke   Anne   Kalani Liz    Brent 

$Tim
+ 10/10 vertices, named, from 8e6f538:
 [1] Luke   Anne   Zee    Zee    Jazmin Juan   Kalani Liz    Liz    Brent 

$Jazmin
+ 9/10 vertices, named, from 8e6f538:
[1] Luke   Anne   Zee    Tim    Juan   Kalani Liz    Liz    Brent 

$Juan
+ 7/10 vertices, named, from 8e6f538:
[1] Anne   Anne   Zee    Tim    Jazmin Kalani Liz   

$Kalani
+ 9/10 vertices, named, from 8e6f538:
[1] Luke   Anne   Anne   Zee    Zee    Bob    Tim    Jazmin Juan  

$Liz
+ 11/10 vertices, named, from 8e6f538:
 [1] Luke   Anne   Zee    Bob    Tim    Tim    Jazmin Jazmin Juan   Brent 
[11] Brent 

$Brent
+ 7/10 vertices, named, from 8e6f538:
[1] Luke   Anne   Bob    Tim    Jazmin Liz    Liz   

Converting igraph back to an adjacency matrix

Notably, you might see that an adjacency matrix is a “sparse matrix” and is distinct from the base R class of “matrix”. To convert it, you will need to go a step further.

pilotAdj <- as_adjacency_matrix(pilotGraph)
pilotAdj
10 x 10 sparse Matrix of class "dgCMatrix"
                          
Luke   . . . 1 . 1 . . 1 .
Anne   1 . 1 . 1 . 1 1 . .
Zee    . . . . 1 1 . 1 1 .
Bob    . 1 . . . . . . . 1
Tim    1 . 1 . . . . 1 1 .
Jazmin . 1 . . 1 . 1 . 1 .
Juan   . 1 1 . 1 . . 1 . .
Kalani 1 1 1 1 . 1 . . . .
Liz    . 1 . 1 1 1 1 . . 1
Brent  1 1 . . 1 1 . . 1 .

Converting sparse matrix to base matrix

pilotBaseM <- as.matrix(pilotAdj)
pilotBaseM
       Luke Anne Zee Bob Tim Jazmin Juan Kalani Liz Brent
Luke      0    0   0   1   0      1    0      0   1     0
Anne      1    0   1   0   1      0    1      1   0     0
Zee       0    0   0   0   1      1    0      1   1     0
Bob       0    1   0   0   0      0    0      0   0     1
Tim       1    0   1   0   0      0    0      1   1     0
Jazmin    0    1   0   0   1      0    1      0   1     0
Juan      0    1   1   0   1      0    0      1   0     0
Kalani    1    1   1   1   0      1    0      0   0     0
Liz       0    1   0   1   1      1    1      0   0     1
Brent     1    1   0   0   1      1    0      0   1     0

Simple Inspection

Show Vertices

V(pilotGraph)
+ 10/10 vertices, named, from 8e6f538:
 [1] Luke   Anne   Zee    Bob    Tim    Jazmin Juan   Kalani Liz    Brent 

Show Edges

E(pilotGraph)
+ 42/42 edges from 8e6f538 (vertex names):
 [1] Luke  ->Bob    Luke  ->Jazmin Luke  ->Liz    Anne  ->Luke   Anne  ->Zee   
 [6] Anne  ->Tim    Anne  ->Juan   Anne  ->Kalani Zee   ->Tim    Zee   ->Jazmin
[11] Zee   ->Kalani Zee   ->Liz    Bob   ->Anne   Bob   ->Brent  Tim   ->Luke  
[16] Tim   ->Zee    Tim   ->Kalani Tim   ->Liz    Jazmin->Anne   Jazmin->Tim   
[21] Jazmin->Juan   Jazmin->Liz    Juan  ->Anne   Juan  ->Zee    Juan  ->Tim   
[26] Juan  ->Kalani Kalani->Luke   Kalani->Anne   Kalani->Zee    Kalani->Bob   
[31] Kalani->Jazmin Liz   ->Anne   Liz   ->Bob    Liz   ->Tim    Liz   ->Jazmin
[36] Liz   ->Juan   Liz   ->Brent  Brent ->Luke   Brent ->Anne   Brent ->Tim   
[41] Brent ->Jazmin Brent ->Liz   

Total Vertices

gorder(pilotGraph)
[1] 10

Total Edges

gsize(pilotGraph)
[1] 42

Basic Plot

plot(pilotGraph)

What is a Directed/Undirected Graph

  • Directed Graph: Directed graphs are a class of graphs that don’t presume symmetry or reciprocity in the edges established between vertices. In a directed graph, if a and b are two vertices connected by an edge (a,b), this doesn’t necessarily mean that an edge connecting (b,a) also exists

  • Undirected Graph: Symmetric; Undirected graphs are more specific. For them, there’s an extra assumption regarding the reciprocity in the relationship between pairs of vertices connected by an edge. If an edge (a,b) exists between two vertices a and b, the edge (b,a) also exists

Examples

Examples of directed graphs include parents and their offspring; Twitter

Examples of undirected graphs include pedestrian pathways, where movement between any two intersections of paths is possible in both directions; railways; Facebook; correlation matrices

Depiction of Directed Graph

Our prior adjacency matrix we generated is a directed graph.

print(pilotAdj)
10 x 10 sparse Matrix of class "dgCMatrix"
                          
Luke   . . . 1 . 1 . . 1 .
Anne   1 . 1 . 1 . 1 1 . .
Zee    . . . . 1 1 . 1 1 .
Bob    . 1 . . . . . . . 1
Tim    1 . 1 . . . . 1 1 .
Jazmin . 1 . . 1 . 1 . 1 .
Juan   . 1 1 . 1 . . 1 . .
Kalani 1 1 1 1 . 1 . . . .
Liz    . 1 . 1 1 1 1 . . 1
Brent  1 1 . . 1 1 . . 1 .

Depiction of Undirected Graph

set.seed(48)
pilotAdjUn <- sapply(1:10, function(x) sample(c(0,1),10,replace=T))
pilotAdjUn[lower.tri(pilotAdjUn)] = t(pilotAdjUn)[lower.tri(pilotAdjUn)]
diag(pilotAdjUn) <- 0
pilotAdjUn
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0    0    1    1    0    1    1    0     0
 [2,]    0    0    1    1    0    0    1    0    0     0
 [3,]    0    1    0    1    1    0    1    1    0     1
 [4,]    1    1    1    0    0    0    1    1    1     0
 [5,]    1    0    1    0    0    0    1    0    0     0
 [6,]    0    0    0    0    0    0    1    0    0     0
 [7,]    1    1    1    1    1    1    0    1    1     1
 [8,]    1    0    1    1    0    0    1    0    1     0
 [9,]    0    0    0    1    0    0    1    1    0     0
[10,]    0    0    1    0    0    0    1    0    0     0
pilotGraphUn <- graph_from_adjacency_matrix(pilotAdjUn, mode="undirected")
is.directed(pilotGraphUn)
[1] FALSE

What is a weighted/unweighted graph

Attributes

Vertex Attributes

partyAtts <- sample(c("Dem","Rep"), gorder(pilotGraph) ,replace = T) # Random attributes
partyAtts # Inspect
 [1] "Rep" "Dem" "Rep" "Rep" "Rep" "Rep" "Dem" "Rep" "Dem" "Dem"
pilotGraph <-set_vertex_attr(pilotGraph, "party", value = partyAtts) # Assign attributes

Check Vertex Attributes

vertex_attr(pilotGraph) # Check vertex attributes
$name
 [1] "Luke"   "Anne"   "Zee"    "Bob"    "Tim"    "Jazmin" "Juan"   "Kalani"
 [9] "Liz"    "Brent" 

$party
 [1] "Rep" "Dem" "Rep" "Rep" "Rep" "Rep" "Dem" "Rep" "Dem" "Dem"
V(pilotGraph)$party # or check specific vertex attribute
 [1] "Rep" "Dem" "Rep" "Rep" "Rep" "Rep" "Dem" "Rep" "Dem" "Dem"

Edge Attributes

friendshipAtts <- rnorm(gsize(pilotGraph))
friendshipAtts
 [1]  0.0003886532 -0.9084064952  0.7297462996  1.3719967855 -0.2095287169
 [6]  0.0487437067 -0.4387685857 -0.1643273084  1.9826443201 -1.0800525952
[11]  0.3165564912 -1.0852255825 -0.8968359306 -0.4142618756 -0.3524979177
[16] -0.6337354192 -0.0719193894 -0.7048606787  1.8262535280 -1.3985064735
[21] -1.1625890724 -0.8335206467  0.1487729735  1.2561356477 -0.9197363463
[26] -0.5573351508  0.9715357795  0.9341367607  0.2990071515 -0.7463273171
[31] -0.3663144421 -0.0704711686 -1.2666863813  0.8853194276  0.1243266197
[36]  0.6058275304  1.5574154511  0.4231836437 -0.8765581216  1.0745972898
[41]  0.0741053088  0.9017606859
pilotGraph <-set_edge_attr(pilotGraph, "friendship", value = friendshipAtts) # Assign attributes

Check Attributes

edge_attr(pilotGraph) # Check vertex attributes
$friendship
 [1]  0.0003886532 -0.9084064952  0.7297462996  1.3719967855 -0.2095287169
 [6]  0.0487437067 -0.4387685857 -0.1643273084  1.9826443201 -1.0800525952
[11]  0.3165564912 -1.0852255825 -0.8968359306 -0.4142618756 -0.3524979177
[16] -0.6337354192 -0.0719193894 -0.7048606787  1.8262535280 -1.3985064735
[21] -1.1625890724 -0.8335206467  0.1487729735  1.2561356477 -0.9197363463
[26] -0.5573351508  0.9715357795  0.9341367607  0.2990071515 -0.7463273171
[31] -0.3663144421 -0.0704711686 -1.2666863813  0.8853194276  0.1243266197
[36]  0.6058275304  1.5574154511  0.4231836437 -0.8765581216  1.0745972898
[41]  0.0741053088  0.9017606859
E(pilotGraph)$friendship # Check specific edge attributes
 [1]  0.0003886532 -0.9084064952  0.7297462996  1.3719967855 -0.2095287169
 [6]  0.0487437067 -0.4387685857 -0.1643273084  1.9826443201 -1.0800525952
[11]  0.3165564912 -1.0852255825 -0.8968359306 -0.4142618756 -0.3524979177
[16] -0.6337354192 -0.0719193894 -0.7048606787  1.8262535280 -1.3985064735
[21] -1.1625890724 -0.8335206467  0.1487729735  1.2561356477 -0.9197363463
[26] -0.5573351508  0.9715357795  0.9341367607  0.2990071515 -0.7463273171
[31] -0.3663144421 -0.0704711686 -1.2666863813  0.8853194276  0.1243266197
[36]  0.6058275304  1.5574154511  0.4231836437 -0.8765581216  1.0745972898
[41]  0.0741053088  0.9017606859

Subsetting Graph

We may want to subset all nodes that are “Dem”:

V(pilotGraph)[party=="Dem"]
+ 4/10 vertices, named, from 8e6f538:
[1] Anne  Juan  Liz   Brent

We may want to create a subsettted graph of only “Dem”:

induced_subgraph(pilotGraph, vids=which(V(pilotGraph)$party=="Dem") )
IGRAPH 63d71a1 DN-- 4 7 -- 
+ attr: name (v/c), party (v/c), friendship (e/n)
+ edges from 63d71a1 (vertex names):
[1] Anne ->Juan  Juan ->Anne  Liz  ->Anne  Liz  ->Juan  Liz  ->Brent
[6] Brent->Anne  Brent->Liz  

We may want to subset all of the edges that include “Anne”:

E(pilotGraph)[[inc("Anne")]]
+ 11/42 edges from 8e6f538 (vertex names):
     tail   head tid hid  friendship
4    Anne   Luke   2   1  1.37199679
5    Anne    Zee   2   3 -0.20952872
6    Anne    Tim   2   5  0.04874371
7    Anne   Juan   2   7 -0.43876859
8    Anne Kalani   2   8 -0.16432731
13    Bob   Anne   4   2 -0.89683593
19 Jazmin   Anne   6   2  1.82625353
23   Juan   Anne   7   2  0.14877297
28 Kalani   Anne   8   2  0.93413676
32    Liz   Anne   9   2 -0.07047117
39  Brent   Anne  10   2 -0.87655812

We may want to subset all edges with a friendship greater than 0:

E(pilotGraph)[[friendship>0]]
+ 20/42 edges from 8e6f538 (vertex names):
     tail   head tid hid   friendship
1    Luke    Bob   1   4 0.0003886532
3    Luke    Liz   1   9 0.7297462996
4    Anne   Luke   2   1 1.3719967855
6    Anne    Tim   2   5 0.0487437067
9     Zee    Tim   3   5 1.9826443201
11    Zee Kalani   3   8 0.3165564912
19 Jazmin   Anne   6   2 1.8262535280
23   Juan   Anne   7   2 0.1487729735
24   Juan    Zee   7   3 1.2561356477
27 Kalani   Luke   8   1 0.9715357795
28 Kalani   Anne   8   2 0.9341367607
29 Kalani    Zee   8   3 0.2990071515
34    Liz    Tim   9   5 0.8853194276
35    Liz Jazmin   9   6 0.1243266197
36    Liz   Juan   9   7 0.6058275304
37    Liz  Brent   9  10 1.5574154511
38  Brent   Luke  10   1 0.4231836437
40  Brent    Tim  10   5 1.0745972898
41  Brent Jazmin  10   6 0.0741053088
42  Brent    Liz  10   9 0.9017606859

Head of nth edge

Starting vertex of all edges

head_of(pilotGraph, E(pilotGraph))
+ 42/10 vertices, named, from 8e6f538:
 [1] Bob    Jazmin Liz    Luke   Zee    Tim    Juan   Kalani Tim    Jazmin
[11] Kalani Liz    Anne   Brent  Luke   Zee    Kalani Liz    Anne   Tim   
[21] Juan   Liz    Anne   Zee    Tim    Kalani Luke   Anne   Zee    Bob   
[31] Jazmin Anne   Bob    Tim    Jazmin Juan   Brent  Luke   Anne   Tim   
[41] Jazmin Liz   

Incident

Any edges to or from a given vertex... Here, for “Anne”

incident(pilotGraph, "Anne", mode=c("all"))
+ 11/42 edges from 8e6f538 (vertex names):
 [1] Anne  ->Luke   Anne  ->Zee    Bob   ->Anne   Anne  ->Tim    Jazmin->Anne  
 [6] Anne  ->Juan   Juan  ->Anne   Anne  ->Kalani Kalani->Anne   Liz   ->Anne  
[11] Brent ->Anne  

Neighbors

We can subset any nodes that are neighboring a node.

neighbors(pilotGraph, "Anne", mode=c("all"))
+ 11/10 vertices, named, from 8e6f538:
 [1] Luke   Zee    Bob    Tim    Jazmin Juan   Juan   Kalani Kalani Liz   
[11] Brent 

What friends do Anne and Liz share in common?

We can use intersection to discover this.

afriends <- neighbors(pilotGraph, "Anne", mode=c("out"))
lfriends <- neighbors(pilotGraph, "Anne", mode=c("out"))
intersection(afriends, lfriends)
+ 5/10 vertices, named, from 8e6f538:
[1] Luke   Zee    Tim    Juan   Kalani

Nodes in N steps…

ego(pilotGraph, order=1, mindist = 0, nodes = "Bob") # 1 step away
[[1]]
+ 6/10 vertices, named, from 8e6f538:
[1] Bob    Luke   Anne   Kalani Liz    Brent 
ego(pilotGraph, order=1, mindist = 1, nodes = "Bob") # 1 step away, not including Bob which is identical to neighbors() output
[[1]]
+ 5/10 vertices, named, from 8e6f538:
[1] Luke   Anne   Kalani Liz    Brent 
ego(pilotGraph, order=2, mindist = 0, nodes = "Bob") # 2 steps away, not including Bob
[[1]]
+ 10/10 vertices, named, from 8e6f538:
 [1] Bob    Luke   Anne   Kalani Liz    Brent  Tim    Jazmin Zee    Juan  

Relationships and Size

What are the two farthest apart nodes?

The distance between these two nodes reflect the diameter of the network?

farthest_vertices(pilotGraph)
$vertices
+ 2/10 vertices, named, from 8e6f538:
[1] Luke Zee 

$distance
[1] 3

What is the diameter?

As you can see, the diameter returns the same distance as is observed in farthest vertices.

diameter(pilotGraph)
[1] 3

Distance of all nodes

You can extract a table of all distances among all nodes as well.

distances(pilotGraph)
       Luke Anne Zee Bob Tim Jazmin Juan Kalani Liz Brent
Luke      0    1   2   1   1      1    2      1   1     1
Anne      1    0   1   1   1      1    1      1   1     1
Zee       2    1   0   2   1      1    1      1   1     2
Bob       1    1   2   0   2      2    2      1   1     1
Tim       1    1   1   2   0      1    1      1   1     1
Jazmin    1    1   1   2   1      0    1      1   1     1
Juan      2    1   1   2   1      1    0      1   1     2
Kalani    1    1   1   1   1      1    1      0   2     2
Liz       1    1   1   1   1      1    1      2   0     1
Brent     1    1   2   1   1      1    2      2   1     0

Similarity

The vertex similarity coefficient of a pair of vertices is a measure of how similar are the immediate networks of those two vertices. Imagine we have two vertices V1 and V2 in an unweighted graph G. There are three common ways to calculate the vertex similarity of V1 and V2.

Jaccard Similarity

The Jaccard similarity coefficient is the number of vertices who are neighbors of both V1 and V2 divided by the number of vertices who are neighbors of at least one of V1 and V2.

jmat <- similarity.jaccard(pilotGraph)
jmat
       [,1] [,2]      [,3]  [,4]      [,5]      [,6]      [,7]      [,8]
 [1,] 1.000  0.6 0.6250000 0.500 0.5000000 0.5000000 0.6250000 0.4000000
 [2,] 0.600  1.0 0.5000000 0.400 0.7000000 0.7000000 0.5000000 0.6000000
 [3,] 0.625  0.5 1.0000000 0.375 0.5555556 0.5555556 0.7142857 0.4444444
 [4,] 0.500  0.4 0.3750000 1.000 0.6250000 0.6250000 0.3750000 0.2000000
 [5,] 0.500  0.7 0.5555556 0.625 1.0000000 0.7777778 0.5555556 0.5000000
 [6,] 0.500  0.7 0.5555556 0.625 0.7777778 1.0000000 0.5555556 0.5000000
 [7,] 0.625  0.5 0.7142857 0.375 0.5555556 0.5555556 1.0000000 0.4444444
 [8,] 0.400  0.6 0.4444444 0.200 0.5000000 0.5000000 0.4444444 1.0000000
 [9,] 0.500  0.7 0.4000000 0.300 0.6000000 0.6000000 0.4000000 0.8750000
[10,] 0.625  0.5 0.5000000 0.375 0.4000000 0.4000000 0.5000000 0.6250000
           [,9]     [,10]
 [1,] 0.5000000 0.6250000
 [2,] 0.7000000 0.5000000
 [3,] 0.4000000 0.5000000
 [4,] 0.3000000 0.3750000
 [5,] 0.6000000 0.4000000
 [6,] 0.6000000 0.4000000
 [7,] 0.4000000 0.5000000
 [8,] 0.8750000 0.6250000
 [9,] 1.0000000 0.5555556
[10,] 0.5555556 1.0000000

Dice Similarity

The dice similarity coefficient is twice the number of vertices who are neighbors of both v1�1 and v2�2 divided by the sum of the degree centralities of v1�1 and v2�2. Thus, common neighbors are double counted in this method.

dmat<-similarity.dice(pilotGraph)
dmat
           [,1]      [,2]      [,3]      [,4]      [,5]      [,6]      [,7]
 [1,] 1.0000000 0.7500000 0.7692308 0.6666667 0.6666667 0.6666667 0.7692308
 [2,] 0.7500000 1.0000000 0.6666667 0.5714286 0.8235294 0.8235294 0.6666667
 [3,] 0.7692308 0.6666667 1.0000000 0.5454545 0.7142857 0.7142857 0.8333333
 [4,] 0.6666667 0.5714286 0.5454545 1.0000000 0.7692308 0.7692308 0.5454545
 [5,] 0.6666667 0.8235294 0.7142857 0.7692308 1.0000000 0.8750000 0.7142857
 [6,] 0.6666667 0.8235294 0.7142857 0.7692308 0.8750000 1.0000000 0.7142857
 [7,] 0.7692308 0.6666667 0.8333333 0.5454545 0.7142857 0.7142857 1.0000000
 [8,] 0.5714286 0.7500000 0.6153846 0.3333333 0.6666667 0.6666667 0.6153846
 [9,] 0.6666667 0.8235294 0.5714286 0.4615385 0.7500000 0.7500000 0.5714286
[10,] 0.7692308 0.6666667 0.6666667 0.5454545 0.5714286 0.5714286 0.6666667
           [,8]      [,9]     [,10]
 [1,] 0.5714286 0.6666667 0.7692308
 [2,] 0.7500000 0.8235294 0.6666667
 [3,] 0.6153846 0.5714286 0.6666667
 [4,] 0.3333333 0.4615385 0.5454545
 [5,] 0.6666667 0.7500000 0.5714286
 [6,] 0.6666667 0.7500000 0.5714286
 [7,] 0.6153846 0.5714286 0.6666667
 [8,] 1.0000000 0.9333333 0.7692308
 [9,] 0.9333333 1.0000000 0.7142857
[10,] 0.7692308 0.7142857 1.0000000

Inverse Log Weighted Similarity

The inverse log-weighted similarity coefficient is the sum of the inverse logs of the degrees of the common neighbors of V1 and V2. This definition asserts that common neighbors that have high degree in the network are 'less valuable' in detecting similarity because there is a higher likelihood that they would be a common neighbor simply by chance.

ilwmat<-similarity.invlogweighted(pilotGraph)
ilwmat
          [,1]     [,2]     [,3]     [,4]     [,5]      [,6]      [,7]     [,8]
 [1,] 0.000000 3.351919 3.068013 1.803083 2.675235 2.6544096 2.5956309 2.344814
 [2,] 3.351919 1.938036 4.589016 2.355068 5.216814 4.7150902 2.6975841 4.014241
 [3,] 3.068013 4.589016 1.778828 1.744304 3.130354 3.5438237 3.4850450 2.671672
 [4,] 1.803083 2.355068 1.744304 0.000000 2.734013 2.7340135 1.7062168 1.347963
 [5,] 2.675235 5.216814 3.130354 2.734013 1.795861 5.0437733 3.5401655 4.240574
 [6,] 2.654410 4.715090 3.543824 2.734013 5.043773 0.8340648 3.0384420 3.257953
 [7,] 2.595631 2.697584 3.485045 1.706217 3.540165 3.0384420 0.8340648 3.519340
 [8,] 2.344814 4.014241 2.671672 1.347963 4.240574 3.2579526 3.5193404 1.795861
 [9,] 3.844992 5.450553 3.578348 1.958727 4.344662 3.8221131 3.0937913 5.223821
[10,] 2.761846 2.858712 2.574806 1.764996 3.054180 3.0333548 2.5575437 2.858712
          [,9]     [,10]
 [1,] 3.844992 2.7618462
 [2,] 5.450553 2.8587122
 [3,] 3.578348 2.5748058
 [4,] 1.958727 1.7649955
 [5,] 4.344662 3.0541799
 [6,] 3.822113 3.0333548
 [7,] 3.093791 2.5575437
 [8,] 5.223821 2.8587122
 [9,] 2.806625 3.3310939
[10,] 3.331094 0.8340648
cor(c(jmat), c(dmat))
[1] 0.9875015
cor(c(jmat), c(ilwmat))
[1] -0.007228666
cor(c(dmat), c(ilwmat))
[1] 0.08668629

Important and Influential Nodes: Centrality

Degree Centrality

Degree centrality assigns an importance score based simply on the number of links held by each node.

Simple, straightforward measure of overall connections. Can distinguish between out and in-degree centrality in directed graph.

degree(pilotGraph)
  Luke   Anne    Zee    Bob    Tim Jazmin   Juan Kalani    Liz  Brent 
     7     11      8      5     10      9      7      9     11      7 
degree(pilotGraph, mode="out")
  Luke   Anne    Zee    Bob    Tim Jazmin   Juan Kalani    Liz  Brent 
     3      5      4      2      4      4      4      5      6      5 
degree(pilotGraph, mode="in")
  Luke   Anne    Zee    Bob    Tim Jazmin   Juan Kalani    Liz  Brent 
     4      6      4      3      6      5      3      4      5      2 

Closeness Centrality

The closeness centrality of a vertex v in a connected graph is the inverse of the sum of the distances from v to all other vertices. Closeness centrality is a measure of how efficiently the entire graph can be traversed from a given vertex.

closeness(pilotGraph)
      Luke       Anne        Zee        Bob        Tim     Jazmin       Juan 
0.05882353 0.07142857 0.07142857 0.06250000 0.07142857 0.07142857 0.06666667 
    Kalani        Liz      Brent 
0.07692308 0.08333333 0.07692308 

Betweenness Centrality

The betweenness centrality of a vertex V is calculated by taking each pair of other vertices X and Y, calculating the number of shortest paths between X and Y that go through V, dividing by the total number of shortest paths between X and Y, then summing over all such pairs of vertices in the graph. Betweenness centrality is a measure of how important a given vertex is in connecting other pairs of vertices in the graph… how frequently the vertex lies on shortest paths between any two vertices in the network.

betweenness(pilotGraph)
     Luke      Anne       Zee       Bob       Tim    Jazmin      Juan    Kalani 
 2.983333  9.523810  3.116667  2.852381  5.938095  4.523810  1.904762  5.616667 
      Liz     Brent 
12.207143  3.333333 

Eigenvector Centrality

The Eigenvector centrality or relative centrality or prestige of a vertex is a measure of how connected the vertex is to other influential vertices. EigenCentrality measures a node's influence based on the number of links it has to other nodes in the network. EigenCentrality then goes a step further by also taking into account how well connected a node is, and how many links their connections have, and so on through the network.

eigen_centrality(pilotGraph)$vector
     Luke      Anne       Zee       Bob       Tim    Jazmin      Juan    Kalani 
0.6664506 0.9890806 0.8248114 0.4772630 0.9650966 0.8805495 0.7411246 0.8404030 
      Liz     Brent 
1.0000000 0.6828100 

PageRank

PageRank is a variant of EigenCentrality, also assigning nodes a score based on their connections, and their connections' connections. The difference is that PageRank also takes link direction and weight into account – so links can only pass influence in one direction, and pass different amounts of influence.

page.rank(pilotGraph)$vector
      Luke       Anne        Zee        Bob        Tim     Jazmin       Juan 
0.09260366 0.13238445 0.09814399 0.07592534 0.12618660 0.10777405 0.07780444 
    Kalani        Liz      Brent 
0.10170905 0.12280305 0.06466537 

Global Properties

Network Density

The proportion of all potential edges between vertices that actually exist in the network graph. It is an indicator of how well connected the vertices of the graph are.

edge_density(pilotGraph)
[1] 0.4666667

Average Path Length

The mean of the lengths of the shortest paths between all pairs of vertices in the network.

mean_distance(pilotGraph)
[1] 1.577778

Global Transitivity

Probability that the adjacent verticles of a given vertex are connected (e.g., triangles)

transitivity(pilotGraph)
[1] 0.7880184

Local Transitivity

Metric calculates the proportion of closed triangles that a vertex is part of, given the number of triangles it could be a part of

transitivity(pilotGraph, type="local")
 [1] 0.8095238 0.7222222 0.9333333 0.8000000 0.7857143 0.7857143 0.9333333
 [8] 0.7142857 0.7142857 0.8666667

Cliques

All vertices connected to each other.

largest_cliques(pilotGraph)
[[1]]
+ 6/10 vertices, named, from 8e6f538:
[1] Luke   Anne   Jazmin Tim    Brent  Liz   

[[2]]
+ 6/10 vertices, named, from 8e6f538:
[1] Juan   Anne   Jazmin Tim    Zee    Kalani

[[3]]
+ 6/10 vertices, named, from 8e6f538:
[1] Juan   Anne   Jazmin Tim    Zee    Liz   

Cliques of Any Size

max_cliques(pilotGraph)
[[1]]
+ 4/10 vertices, named, from 8e6f538:
[1] Bob    Luke   Anne   Kalani

[[2]]
+ 5/10 vertices, named, from 8e6f538:
[1] Bob   Luke  Anne  Brent Liz  

[[3]]
+ 5/10 vertices, named, from 8e6f538:
[1] Luke   Anne   Jazmin Tim    Kalani

[[4]]
+ 6/10 vertices, named, from 8e6f538:
[1] Luke   Anne   Jazmin Tim    Brent  Liz   

[[5]]
+ 6/10 vertices, named, from 8e6f538:
[1] Juan   Anne   Jazmin Tim    Zee    Kalani

[[6]]
+ 6/10 vertices, named, from 8e6f538:
[1] Juan   Anne   Jazmin Tim    Zee    Liz   

Special Relationships

Assortativity: Do birds of a feather flock together?

Do similar nodes connect to each other?

You can calculate assortativity based on categories (nominal), numeric values, or degree centrality

assortativity.nominal(UKfaculty, V(UKfaculty)$Group)
[1] 0.7053802

Recriprocity

What is the probability of an edge going both ways?

reciprocity(UKfaculty)
[1] 0.5875153

Community Detection

Where are connections between nodes more dense?

Fast Greedy Algorithm

Only work on undirected graphs

fastgreedy.community(as.undirected(UKfaculty))
IGRAPH clustering fast greedy, groups: 5, mod: 0.56
+ groups:
  $`1`
   [1]  2  8 11 15 18 19 21 25 29 31 34 35 39 41 43 46 57 58 79
  
  $`2`
   [1] 24 32 37 38 48 50 52 54 55 64 70
  
  $`3`
   [1]  1  3  4  9 17 36 44 45 53 59 60 61 62 73 74 75 78 81
  
  $`4`
  + ... omitted several groups/vertices

Betweenness Algorithm

Dividing into smaller pieces until identifies bridges between communities

edge.betweenness.community(UKfaculty)
IGRAPH clustering edge betweenness, groups: 14, mod: 0.12
+ groups:
  $`1`
  [1] 1
  
  $`2`
   [1]  2  3  4  5  7  8  9 10 12 13 16 17 18 19 21 23 25 27 28 29 31 33 35 37
  [25] 38 40 41 42 43 46 48 49 50 51 52 53 54 58 59 60 61 62 65 68 69 70 71 72
  [49] 73 74 75 76 77 78 79 81
  
  $`3`
  [1]  6 22 30 66
  + ... omitted several groups/vertices

Walk Trap Algorithm

x<-walktrap.community(UKfaculty)
length(x)
[1] 6
sizes(x)
Community sizes
 1  2  3  4  5  6 
15  8 25 25  2  6 
membership(x)
 [1] 1 4 1 1 2 3 3 4 5 3 4 3 3 6 4 3 1 4 4 6 4 3 3 2 4 6 3 3 4 3 4 2 3 4 4 1 4 4
[39] 4 3 4 3 4 1 1 4 3 2 3 4 6 2 1 4 2 6 4 4 1 5 1 4 3 2 3 3 2 3 3 4 3 3 1 1 1 3
[77] 3 1 4 6 1
plot(x,UKfaculty)

Plotting

Base Plotting

Layouts

# Plot the graph object pilotGraph in a circle layout
plot(pilotGraph, vertex.label.color = "black", layout = layout_in_circle(pilotGraph))

# Plot the graph object pilotGraph in a Fruchterman-Reingold layout 
plot(pilotGraph, vertex.label.color = "black", layout = layout_with_fr(pilotGraph))

# Plot the graph object pilotGraph in a Tree layout 
m <- layout_as_tree(pilotGraph)
plot(pilotGraph, vertex.label.color = "black", layout = m)

# Plot the graph object pilotGraph using igraph's chosen layout 
m1 <- layout_nicely(pilotGraph)
plot(pilotGraph, vertex.label.color = "black", layout = m1)

Vary Edge Thickness by Edge Weight/Attribute

m1 <- layout_nicely(pilotGraph)
plot(pilotGraph, 
        vertex.label.color = "black", 
        edge.color = 'black',
        edge.width = E(pilotGraph)$friendship,
        layout = m1)

Plotting by Attributes

You can conditionally assign colors based on some attribute and plot based off that attribute. igraph vertices have an attribute for color.

V(pilotGraph)$color <- ifelse(V(pilotGraph)$party == "Dem", "blue", "red")
plot(pilotGraph, vertex.label.color="black")

Plot Cliques

# Assign largest cliques output to object 'lc'
lc <- largest_cliques(pilotGraph)
# Create two new undirected subgraphs, each containing only the vertices of each largest clique.
gs1 <- as.undirected(subgraph(pilotGraph, lc[[1]]))
gs2 <- as.undirected(subgraph(pilotGraph, lc[[2]]))
# Plot the two largest cliques side-by-side
par(mfrow=c(1,2)) # To plot two plots side-by-side
plot(gs1,
     vertex.label.color = "black", 
     vertex.label.cex = 0.9,
     vertex.size = 0,
     edge.color = 'gray28',
     main = "Largest Clique 1",
     layout = layout.circle(gs1)
)
plot(gs2,
     vertex.label.color = "black", 
     vertex.label.cex = 0.9,
     vertex.size = 0,
     edge.color = 'gray28',
     main = "Largest Clique 2",
     layout = layout.circle(gs2)
)

Using a Real Dataset: ‘UK Faculty’

data("UKfaculty")

Can you inspect the UK

Depiction of assortativity

# select colors
colors = brewer.pal(4, "Dark2")
# assign colors to groups
V(UKfaculty)$color = sapply(V(UKfaculty)$Group, function(x) colors[x])

plot(UKfaculty, layout = layout_nicely(UKfaculty, dim = 2),
     vertex.color = V(UKfaculty)$color, vertex.frame.color = NA,
     vertex.label = NA, vertex.shape = 'square',
     vertex.size = 3.5, edge.arrow.size = 0.3, edge.curved = TRUE,
     edge.width = E(UKfaculty)$weight ^ 0.8,
     edge.color = rgb(0, 0, 0, alpha = 0.1))
title("UK Faculty Friendship Network (Directed)", cex.main = 1)